Using Modulation Spectrum as a Metric to Quantify Over-Smoothing Effects in Statistical Parametric Speech Synthesis
Authors
Abstract
Interactive real-time tools for education were developed using MATLAB GUI functions. They are: a) a real-time spectrogram with narrowband, wideband, and perceptual frequency resolution axes; b) a real-time spectrum monitor followed by an interactive spectrogram display, with sound playback of a rectangular time-frequency region and scrubbing inspection; c) a real-time vocal tract shape display of input sounds; and d) real-time F0 extraction with event detection and display. In addition to these tools, animation generator functions are provided for understanding the fundamentals of the Fourier transform and digital signal processing.
Related resources
Pulse density representation of spectrum for statistical speech processing
This study investigates a new spectral representation that is suitable for statistical parametric speech synthesis. Statistical speech processing involves spectral averaging in the training process; however, averaging spectra in the domain of conventional speech parameters over-smooths the resulting means, which degrades the quality of the speech synthesised. In the proposed representation, hig...
Refined inter-segment joining in multi-form speech synthesis
In multi-form speech synthesis, speech output is constructed by splicing waveform segments and parametric speech segments which are generated from statistical models. The decision whether to use the waveform or the statistical parametric form is made per segment. This approach faces certain challenges in the context of inter-segment joining. In this work, we present a novel method whereby all n...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis, and each has its own challenges. Unit-selection speech synthesis and statistical parametr...
Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis
This paper presents a novel training algorithm for Hidden Markov Model (HMM)-based speech synthesis. One of the biggest issues causing significant quality degradation in synthetic speech is the over-smoothing effect often observed in generated speech parameter trajectories. Recently, we have found that a Modulation Spectrum (MS) of the generated speech parameters is sensitively correlated with ...
Sub-band text-to-speech combining sample-based spectrum with statistically generated spectrum
As described in this paper, we propose a sub-band speech synthesis approach to develop a high quality Text-to-Speech (TTS) system: a sample-based spectrum is used in the high-frequency band and spectrum generated by HMM-based TTS is used in the low-frequency band. Herein, sample-based spectrum means spectrum selected from a phoneme database such that it is the most similar to spectrum generated...